Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems
The principle of "optimism in the face of uncertainty"
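As a rough illustration of this principle, the sketch below uses a UCB1-style index (empirical mean plus a confidence bonus), which is one well-known instance of optimism and not necessarily the exact variant analysed in the survey. The names `ucb1` and `make_bernoulli_bandit`, the Bernoulli reward model, and the bonus constant 2 are assumptions made for the example.

```python
import math
import random


def ucb1(pull, n_arms, horizon):
    """Run a UCB1-style strategy; pull(arm) must return a reward in [0, 1].

    Returns the number of times each arm was played.
    """
    counts = [0] * n_arms   # number of times each arm was played
    sums = [0.0] * n_arms   # total reward collected from each arm
    for t in range(1, horizon + 1):
        if t <= n_arms:
            # Play every arm once so that each empirical mean is defined.
            arm = t - 1
        else:
            # Optimism: score each arm by its empirical mean plus a bonus
            # that shrinks as the arm is sampled more often.
            arm = max(
                range(n_arms),
                key=lambda i: sums[i] / counts[i]
                + math.sqrt(2.0 * math.log(t) / counts[i]),
            )
        reward = pull(arm)
        counts[arm] += 1
        sums[arm] += reward
    return counts


def make_bernoulli_bandit(means, seed=0):
    """Hypothetical test environment: arm i pays 1 with probability means[i]."""
    rng = random.Random(seed)
    return lambda arm: 1.0 if rng.random() < means[arm] else 0.0


if __name__ == "__main__":
    bandit = make_bernoulli_bandit([0.1, 0.9], seed=0)
    print(ucb1(bandit, n_arms=2, horizon=2000))
```

An arm that has been tried only a few times keeps a large bonus, so it is revisited until the data rule it out; this is how the optimistic index trades exploration against exploitation.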
Sébastien Bubeck and Nicolò Cesa-Bianchi. (2012). "Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems". Foundations and Trends in Machine Learning, 5(1), 1-122.